Overview

Dataset statistics

Number of variables: 12
Number of observations: 782
Missing cells: 314
Missing cells (%): 3.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 73.4 KiB
Average record size in memory: 96.2 B
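The missing-cell percentage can be checked from the counts above. This is a minimal sketch that assumes the percentage is the number of missing cells divided by the total number of cells (variables times observations); the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 12
n_observations = 782
missing_cells = 314

# Assumption: total cells = variables x observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```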

Variable types

NUM: 10
CAT: 2

Reproduction

Analysis started: 2020-08-31 11:32:16.237102
Analysis finished: 2020-08-31 11:32:56.208395
Duration: 39.97 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
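The reported duration can be cross-checked against the two timestamps. This is a small sketch using only the standard library; the timestamp format string is an assumption based on how the values are printed above.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed format, matching the printed values
started = datetime.strptime("2020-08-31 11:32:16.237102", FMT)
finished = datetime.strptime("2020-08-31 11:32:56.208395", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```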

Warnings

country_mapped has a high cardinality: 164 distinct values (High cardinality)
score is highly correlated with rank (High correlation)
rank is highly correlated with score (High correlation)
dystopia has 312 (39.9%) missing values (Missing)
country_mapped is uniformly distributed (Uniform)
rank is uniformly distributed (Uniform)

Variables

country_mapped
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count: 164
Unique (%): 21.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB

Value               Count  Frequency (%)
Bolivia                 5   0.6%
New Zealand             5   0.6%
Nicaragua               5   0.6%
Tunisia                 5   0.6%
Nigeria                 5   0.6%
United Kingdom          5   0.6%
Finland                 5   0.6%
Italy                   5   0.6%
Argentina               5   0.6%
Pakistan                5   0.6%
Other values (154)    732  93.6%

Length

Max length: 24
Median length: 7
Mean length: 8.164961637
Min length: 4

region
Categorical

Distinct count: 10
Unique (%): 1.3%
Missing: 1
Missing (%): 0.1%
Memory size: 6.1 KiB

Value                            Count  Frequency (%)
Sub-Saharan Africa                 195  24.9%
Central and Eastern Europe         145  18.5%
Latin America and Caribbean        111  14.2%
Western Europe                     105  13.4%
Middle East and Northern Africa     96  12.3%
Southeastern Asia                   44   5.6%
Southern Asia                       35   4.5%
Eastern Asia                        30   3.8%
North America                       10   1.3%
Australia and New Zealand           10   1.3%
(Missing)                            1   0.1%

Length

Max length: 31
Median length: 18
Mean length: 21.31585678
Min length: 3

year
Real number (ℝ≥0)

Distinct count: 5
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.9936061381075
Minimum: 2015
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 2015
5-th percentile: 2015
Q1: 2016
Median: 2017
Q3: 2018
95-th percentile: 2019
Maximum: 2019
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.417364432
Coefficient of variation (CV): 0.0007027114157
Kurtosis: -1.305269806
Mean: 2016.993606
Median Absolute Deviation (MAD): 1
Skewness: 0.005903894403
Sum: 1577289
Variance: 2.008921934
Histogram with fixed size bins (bins=10)
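For readers without the rendered chart, the sketch below shows one way a histogram with 10 fixed-size bins can be derived from the per-year counts reported for this variable. It assumes equal-width bins over [minimum, maximum] with the last bin closed on the right (the convention used by numpy.histogram); the function name is illustrative and this is not necessarily how the report drew its figure.

```python
# Year counts copied from the frequency table for `year`.
year_counts = {2015: 158, 2016: 157, 2017: 155, 2018: 156, 2019: 156}

def fixed_width_histogram(value_counts, bins=10):
    """Count observations in `bins` equal-width intervals over [min, max].

    Assumes at least two distinct values, so that max > min.
    """
    lo, hi = min(value_counts), max(value_counts)
    histogram = [0] * bins
    for value, count in value_counts.items():
        # Position of the value on a 0..bins scale; the maximum is clamped
        # into the last bin so that the final interval is closed.
        index = min(int((value - lo) * bins / (hi - lo)), bins - 1)
        histogram[index] += count
    return histogram

print(fixed_width_histogram(year_counts))
# [158, 0, 157, 0, 0, 155, 0, 156, 0, 156]
```

With only five distinct years spread over ten bins, half of the bins are necessarily empty.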
Common values

Value  Count  Frequency (%)
2015     158  20.2%
2016     157  20.1%
2019     156  19.9%
2018     156  19.9%
2017     155  19.8%

Minimum 5 values

Value  Count  Frequency (%)
2015     158  20.2%
2016     157  20.1%
2017     155  19.8%
2018     156  19.9%
2019     156  19.9%

Maximum 5 values

Value  Count  Frequency (%)
2019     156  19.9%
2018     156  19.9%
2017     155  19.8%
2016     157  20.1%
2015     158  20.2%

rank
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM

Distinct count: 158
Unique (%): 20.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.69820971867007
Minimum: 1
Maximum: 158
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.05
Q1: 40
Median: 79
Q3: 118
95-th percentile: 149
Maximum: 158
Range: 157
Interquartile range (IQR): 78

Descriptive statistics

Standard deviation: 45.18238438
Coefficient of variation (CV): 0.5741221375
Kurtosis: -1.199701126
Mean: 78.69820972
Median Absolute Deviation (MAD): 39
Skewness: 0.0004973514565
Sum: 61542
Variance: 2041.447859
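Several of the statistics above are simple functions of one another. The sketch below recomputes them for `rank` from the reported figures, assuming the usual definitions (mean = sum / n, range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
# Figures copied from the `rank` statistics above (no missing values, so n = 782).
n = 782
total = 61542
std = 45.18238438
q1, q3 = 40, 118
minimum, maximum = 1, 158

mean = total / n                 # sum divided by number of observations
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # squared standard deviation

print(mean, value_range, iqr, cv, variance)
```

The results agree with the reported values up to the rounding of the printed inputs.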
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
57                      6   0.8%
82                      6   0.8%
34                      6   0.8%
145                     6   0.8%
50                      5   0.6%
56                      5   0.6%
55                      5   0.6%
54                      5   0.6%
53                      5   0.6%
52                      5   0.6%
Other values (148)    728  93.1%

Minimum 5 values

Value  Count  Frequency (%)
1          5  0.6%
2          5  0.6%
3          5  0.6%
4          5  0.6%
5          5  0.6%

Maximum 5 values

Value  Count  Frequency (%)
158        1  0.1%
157        2  0.3%
156        4  0.5%
155        5  0.6%
154        5  0.6%

score
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 716
Unique (%): 91.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.379017902998669
Minimum: 2.69300007820129
Maximum: 7.769
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 2.693000078
5-th percentile: 3.58715
Q1: 4.50975
Median: 5.322
Q3: 6.1895
95-th percentile: 7.31395
Maximum: 7.769
Range: 5.075999922
Interquartile range (IQR): 1.67975

Descriptive statistics

Standard deviation: 1.12745646
Coefficient of variation (CV): 0.2096026599
Kurtosis: -0.7610545866
Mean: 5.379017903
Median Absolute Deviation (MAD): 0.846
Skewness: 0.03585943327
Sum: 4206.392
Variance: 1.27115807
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
5.835                   3   0.4%
6.375                   3   0.4%
6.379                   3   0.4%
2.905                   3   0.4%
5.129                   3   0.4%
4.35                    3   0.4%
5.89                    3   0.4%
5.192                   3   0.4%
4.36                    2   0.3%
4.356                   2   0.3%
Other values (706)    754  96.4%

Minimum 5 values

Value        Count  Frequency (%)
2.693000078      1  0.1%
2.839            1  0.1%
2.853            1  0.1%
2.904999971      1  0.1%
2.905            3  0.4%

Maximum 5 values

Value  Count  Frequency (%)
7.769      1  0.1%
7.632      1  0.1%
7.6        1  0.1%
7.594      1  0.1%
7.587      1  0.1%

gdp_pc
Real number (ℝ≥0)

Distinct count: 742
Unique (%): 94.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9160474824829717
Minimum: 0.0
Maximum: 2.096
Zeros: 5
Zeros (%): 0.6%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.208797
Q1: 0.6065
Median: 0.9822047088
Q3: 1.236187109
95-th percentile: 1.487882078
Maximum: 2.096
Range: 2.096
Interquartile range (IQR): 0.629687109

Descriptive statistics

Standard deviation: 0.4073401313
Coefficient of variation (CV): 0.4446714161
Kurtosis: -0.6927595054
Mean: 0.9160474825
Median Absolute Deviation (MAD): 0.2998890486
Skewness: -0.3185805094
Sum: 716.3491313
Variance: 0.1659259826
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0                       5   0.6%
0.96                    4   0.5%
0.332                   3   0.4%
1.34                    3   0.4%
0.308                   2   0.3%
1.221                   2   0.3%
0.642                   2   0.3%
1.092                   2   0.3%
1.017                   2   0.3%
0.274                   2   0.3%
Other values (732)    755  96.5%

Minimum 5 values

Value          Count  Frequency (%)
0                  5  0.6%
0.0153             1  0.1%
0.01604            1  0.1%
0.02264318429      1  0.1%
0.024              1  0.1%

Maximum 5 values

Value        Count  Frequency (%)
2.096            1  0.1%
1.870765686      1  0.1%
1.82427          1  0.1%
1.741943598      1  0.1%
1.69752          1  0.1%

family
Real number (ℝ≥0)

Distinct count: 732
Unique (%): 93.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0783924825069788
Minimum: 0.0
Maximum: 1.644
Zeros: 5
Zeros (%): 0.6%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.46133
Q1: 0.8693625
Median: 1.124735
Q3: 1.32725
95-th percentile: 1.522
Maximum: 1.644
Range: 1.644
Interquartile range (IQR): 0.4578875

Descriptive statistics

Standard deviation: 0.3295483193
Coefficient of variation (CV): 0.305592189
Kurtosis: 0.1584486833
Mean: 1.078392483
Median Absolute Deviation (MAD): 0.235555
Skewness: -0.6846322898
Sum: 843.3029213
Variance: 0.1086020948
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0                       5   0.6%
1.125                   3   0.4%
1.41                    3   0.4%
1.465                   3   0.4%
1.438                   3   0.4%
1.504                   3   0.4%
1.538                   3   0.4%
1.487                   2   0.3%
1.369                   2   0.3%
1.301                   2   0.3%
Other values (722)    753  96.3%

Minimum 5 values

Value    Count  Frequency (%)
0            5  0.6%
0.10419      1  0.1%
0.11037      1  0.1%
0.13995      1  0.1%
0.147        1  0.1%

Maximum 5 values

Value        Count  Frequency (%)
1.644            1  0.1%
1.624            1  0.1%
1.610574007      1  0.1%
1.601            1  0.1%
1.592            1  0.1%

health
Real number (ℝ≥0)

Distinct count: 705
Unique (%): 90.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.612415577116253
Minimum: 0.0
Maximum: 1.141
Zeros: 5
Zeros (%): 0.6%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.1578945
Q1: 0.4401825
Median: 0.6473095147
Q3: 0.808
95-th percentile: 0.954973
Maximum: 1.141
Range: 1.141
Interquartile range (IQR): 0.3678175

Descriptive statistics

Standard deviation: 0.2483086404
Coefficient of variation (CV): 0.4054577474
Kurtosis: -0.487571207
Mean: 0.6124155771
Median Absolute Deviation (MAD): 0.1686446949
Skewness: -0.5012025622
Sum: 478.9089813
Variance: 0.06165718089
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0.815                   5   0.6%
0.999                   5   0.6%
0                       5   0.6%
0.828                   4   0.5%
0.874                   4   0.5%
0.871                   3   0.4%
0.854                   3   0.4%
0.861                   3   0.4%
0.884                   3   0.4%
0.808                   3   0.4%
Other values (695)    744  95.1%

Minimum 5 values

Value           Count  Frequency (%)
0                   5  0.6%
0.005564753897      1  0.1%
0.01                1  0.1%
0.0187726859        1  0.1%
0.03824             1  0.1%

Maximum 5 values

Value  Count  Frequency (%)
1.141      1  0.1%
1.122      1  0.1%
1.088      1  0.1%
1.062      1  0.1%
1.052      1  0.1%

freedom
Real number (ℝ≥0)

Distinct count: 697
Unique (%): 89.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4110908258223149
Minimum: 0.0
Maximum: 0.7240000000000001
Zeros: 5
Zeros (%): 0.6%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.128096
Q1: 0.3097675
Median: 0.431
Q3: 0.531
95-th percentile: 0.6308885
Maximum: 0.724
Range: 0.724
Interquartile range (IQR): 0.2212325

Descriptive statistics

Standard deviation: 0.1528804206
Coefficient of variation (CV): 0.3718896434
Kurtosis: -0.3072054061
Mean: 0.4110908258
Median Absolute Deviation (MAD): 0.10948
Skewness: -0.5212591254
Sum: 321.4730258
Variance: 0.02337242301
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0                       5   0.6%
0.557                   4   0.5%
0.454                   3   0.4%
0.334                   3   0.4%
0.406                   3   0.4%
0.394                   3   0.4%
0.312                   3   0.4%
0.498                   3   0.4%
0.431                   3   0.4%
0.531                   3   0.4%
Other values (687)    749  95.8%

Minimum 5 values

Value          Count  Frequency (%)
0                  5  0.6%
0.00589            1  0.1%
0.01               1  0.1%
0.013              1  0.1%
0.01499585528      1  0.1%

Maximum 5 values

Value  Count  Frequency (%)
0.724      1  0.1%
0.696      1  0.1%
0.686      1  0.1%
0.683      1  0.1%
0.681      1  0.1%

trust
Real number (ℝ≥0)

Distinct count: 635
Unique (%): 81.3%
Missing: 1
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.12543561357358715
Minimum: 0.0
Maximum: 0.55191
Zeros: 6
Zeros (%): 0.8%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.018
Q1: 0.054
Median: 0.091
Q3: 0.15603
95-th percentile: 0.37124
Maximum: 0.55191
Range: 0.55191
Interquartile range (IQR): 0.10203

Descriptive statistics

Standard deviation: 0.1058164476
Coefficient of variation (CV): 0.8435917404
Kurtosis: 1.880108294
Mean: 0.1254356136
Median Absolute Deviation (MAD): 0.04745
Skewness: 1.5208882
Sum: 97.9652142
Variance: 0.01119712057
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0.082                   7   0.9%
0                       6   0.8%
0.034                   6   0.8%
0.064                   6   0.8%
0.028                   6   0.8%
0.078                   6   0.8%
0.056                   5   0.6%
0.074                   5   0.6%
0.055                   5   0.6%
0.093                   5   0.6%
Other values (625)    724  92.6%

Minimum 5 values

Value    Count  Frequency (%)
0            6  0.8%
0.001        1  0.1%
0.00227      1  0.1%
0.00322      1  0.1%
0.004        1  0.1%

Maximum 5 values

Value    Count  Frequency (%)
0.55191      1  0.1%
0.52208      1  0.1%
0.50521      1  0.1%
0.4921       1  0.1%
0.48357      1  0.1%

generosity
Real number (ℝ≥0)

Distinct count: 664
Unique (%): 84.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.21857584156082985
Minimum: 0.0
Maximum: 0.8380751609802249
Zeros: 5
Zeros (%): 0.6%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.05403037467
Q1: 0.13
Median: 0.2019822115
Q3: 0.2788325
95-th percentile: 0.4704543383
Maximum: 0.838075161
Range: 0.838075161
Interquartile range (IQR): 0.1488325

Descriptive statistics

Standard deviation: 0.1223207487
Coefficient of variation (CV): 0.5596261135
Kurtosis: 2.020258278
Mean: 0.2185758416
Median Absolute Deviation (MAD): 0.07322149014
Skewness: 1.044360015
Sum: 170.9263081
Variance: 0.01496236557
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
0.175                   6   0.8%
0                       5   0.6%
0.187                   5   0.6%
0.153                   5   0.6%
0.197                   4   0.5%
0.099                   4   0.5%
0.083                   4   0.5%
0.22                    3   0.4%
0.137                   3   0.4%
0.285                   3   0.4%
Other values (654)    740  94.6%

Minimum 5 values

Value          Count  Frequency (%)
0                  5  0.6%
0.00199            1  0.1%
0.01016465668      1  0.1%
0.02025            1  0.1%
0.025              1  0.1%

Maximum 5 values

Value         Count  Frequency (%)
0.838075161       1  0.1%
0.81971           1  0.1%
0.79588           1  0.1%
0.6117045879      1  0.1%
0.598             1  0.1%

dystopia
Real number (ℝ≥0)

MISSING

Distinct count: 470
Unique (%): 100.0%
Missing: 312
Missing (%): 39.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.092716638021185
Minimum: 0.32858000000000004
Maximum: 3.83772
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0.32858
5-th percentile: 1.126492703
Q1: 1.737975
Median: 2.09464
Q3: 2.455574545
95-th percentile: 3.025559
Maximum: 3.83772
Range: 3.50914
Interquartile range (IQR): 0.7175995455

Descriptive statistics

Standard deviation: 0.5657717565
Coefficient of variation (CV): 0.2703527779
Kurtosis: 0.4141306299
Mean: 2.092716638
Median Absolute Deviation (MAD): 0.3595132296
Skewness: -0.1216469469
Sum: 983.5768199
Variance: 0.3200976805
Histogram with fixed size bins (bins=10)
Common values

Value               Count  Frequency (%)
1.83302                 1   0.1%
3.31029                 1   0.1%
2.26646                 1   0.1%
1.34759                 1   0.1%
1.99375                 1   0.1%
2.40364                 1   0.1%
1.59888                 1   0.1%
1.322916269             1   0.1%
1.784892559             1   0.1%
2.43801                 1   0.1%
Other values (460)    460  58.8%
(Missing)             312  39.9%

Minimum 5 values

Value         Count  Frequency (%)
0.32858           1  0.1%
0.3779137135      1  0.1%
0.4193892479      1  0.1%
0.5400612354      1  0.1%
0.5546331406      1  0.1%

Maximum 5 values

Value    Count  Frequency (%)
3.83772      1  0.1%
3.60214      1  0.1%
3.55906      1  0.1%
3.50733      1  0.1%
3.40904      1  0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
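The definition above can be written out directly. This is a minimal pure-Python sketch, not the report's own implementation; the function name and the toy data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Hypothetical toy data, not taken from the report.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

print(pearson_r(x, y))                         # increasing linear relation
print(pearson_r(x, [10 * v + 3 for v in y]))   # same after shifting and rescaling y
print(pearson_r(x, [-3 * v + 7 for v in x]))   # decreasing linear relation
```

The first two calls give a value of about +1 regardless of the slope and offset, which illustrates the location and scale invariance; the third gives about -1.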

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
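In other words, ρ is Pearson's r applied to ranks. The sketch below follows that recipe, assuming tied values receive the average of their rank positions (a common convention, not stated in the report); all function names and the toy data are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return cov / sqrt(sum((a - mx) ** 2 for a in xs) * sum((b - my) ** 2 for b in ys))

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical toy data: monotonic but nonlinear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

For this cubic relation ρ is about 1 while r is noticeably smaller, which is the difference the paragraph above describes.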

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
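The pair-counting recipe can be sketched as follows. This is the plain form given in the text (often called tau-a), which makes no adjustment for ties; library implementations may use a tie-adjusted variant instead. The function name and the toy data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Hypothetical toy data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau_a(x, y))  # (5 - 1) / 6
```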

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

Sample

First rows

     country_mapped  region                      year  rank  score  gdp_pc    family    health    freedom   trust     generosity  dystopia
0    Afghanistan     Southern Asia               2015  153   3.575  0.319820  0.302850  0.303350  0.234140  0.097190  0.365100    1.952100
1    Afghanistan     Southern Asia               2016  154   3.360  0.382270  0.110370  0.173440  0.164300  0.071120  0.312680    2.145580
2    Afghanistan     Southern Asia               2017  141   3.794  0.401477  0.581543  0.180747  0.106180  0.061158  0.311871    2.150801
3    Afghanistan     Southern Asia               2018  145   3.632  0.332000  0.537000  0.255000  0.085000  0.036000  0.191000    NaN
4    Afghanistan     Southern Asia               2019  154   3.203  0.350000  0.517000  0.361000  0.000000  0.025000  0.158000    NaN
5    Albania         Central and Eastern Europe  2015  95    4.959  0.878670  0.804340  0.813250  0.357330  0.064130  0.142720    1.898940
6    Albania         Central and Eastern Europe  2016  109   4.655  0.955300  0.501630  0.730070  0.318660  0.053010  0.168400    1.928160
7    Albania         Central and Eastern Europe  2017  109   4.644  0.996193  0.803685  0.731160  0.381499  0.039864  0.201313    1.490442
8    Albania         Central and Eastern Europe  2018  112   4.586  0.916000  0.817000  0.790000  0.419000  0.032000  0.149000    NaN
9    Albania         Central and Eastern Europe  2019  107   4.719  0.947000  0.848000  0.874000  0.383000  0.027000  0.178000    NaN

Last rows

     country_mapped  region              year  rank  score  gdp_pc    family    health    freedom   trust     generosity  dystopia
772  Zambia          Sub-Saharan Africa  2015  85    5.129  0.470380  0.916120  0.299240  0.488270  0.124680  0.195910    2.634300
773  Zambia          Sub-Saharan Africa  2016  106   4.795  0.612020  0.637600  0.235730  0.426620  0.114790  0.178660    2.589910
774  Zambia          Sub-Saharan Africa  2017  116   4.514  0.636407  1.003187  0.257836  0.461603  0.078214  0.249580    1.826705
775  Zambia          Sub-Saharan Africa  2018  125   4.377  0.562000  1.047000  0.295000  0.503000  0.082000  0.221000    NaN
776  Zambia          Sub-Saharan Africa  2019  138   4.107  0.578000  1.058000  0.426000  0.431000  0.087000  0.247000    NaN
777  Zimbabwe        Sub-Saharan Africa  2015  115   4.610  0.271000  1.032760  0.334750  0.258610  0.080790  0.189870    2.441910
778  Zimbabwe        Sub-Saharan Africa  2016  131   4.193  0.350410  0.714780  0.159500  0.254290  0.085820  0.185030    2.442700
779  Zimbabwe        Sub-Saharan Africa  2017  138   3.875  0.375847  1.083096  0.196764  0.336384  0.095375  0.189143    1.597970
780  Zimbabwe        Sub-Saharan Africa  2018  144   3.692  0.357000  1.094000  0.248000  0.406000  0.099000  0.132000    NaN
781  Zimbabwe        Sub-Saharan Africa  2019  146   3.663  0.366000  1.114000  0.433000  0.361000  0.089000  0.151000    NaN